Steinhaus' Geometric Location Problem for
Random Samples in the Plane


Dorit Hochbaum and J. Michael Steele


I.  Introduction. 

The work of H. Steinhaus (1956) was apparently the first explicit
treatment of the natural question "How should one choose n points from
a mass distributed in the plane so as to best represent the whole?" The
main objective of this article is to treat a stochastic analogue of
Steinhaus' problem.

One principal motivation for this stochastic analogue comes from
developments in the theory of algorithms. The first of these is the
discovery by Karp (1977) of an efficient probabilistic algorithm for
solving the traveling salesman problem. The second development was the
proof by Papadimitriou (1980) of the conjecture of Fisher and Hochbaum
that the "Euclidean k-median location problem" is NP-complete.


More will be said about these algorithmic considerations in a later
section, but we should first state our results. For any $x_i \in \mathbb{R}^2$
and integers $k$ and $n$ we define

$$M(k; x_1, x_2, \ldots, x_n) = \min_{S : |S| = k} \sum_{i=1}^{n} \min_{j \in S} \|x_i - x_j\| . \tag{1.1}$$

(Here the minimum is over all $S \subset \{x_1, x_2, \ldots, x_n\}$ such that $S$ has
cardinality $|S| = k$.)

In words, one views $S$ as a set of $k$ "centers," and each "site"
$x_i$ is served by its nearest center $x_j \in S$ at a cost equal to $\|x_i - x_j\|$.
The quantity $M(k; x_1, x_2, \ldots, x_n)$ is therefore the minimal cost
attainable by an optimal choice of $k$ centers. This language is
chosen in sympathy with the applications which have been suggested in
Bollobas (1973), Cornuejols, Fisher, and Nemhauser (1978), and Starret
(1974).
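To make the definition concrete, the following minimal sketch evaluates the minimal cost by exhaustive search over all k-subsets of the sample. The function names `location_cost` and `brute_force_M` and the five-point example are illustrative assumptions, not part of the paper; the exponent `p` defaults to 1, the case of (1.1). The search is exponential in n and is meant only for tiny samples.

```python
from itertools import combinations
from math import dist


def location_cost(points, centers, p=1.0):
    """Total cost when every site is served by its nearest center."""
    return sum(min(dist(x, c) for c in centers) ** p for x in points)


def brute_force_M(points, k, p=1.0):
    """Minimal cost over all k-subsets of the sample (hypothetical helper)."""
    return min(location_cost(points, subset, p)
               for subset in combinations(points, k))


# Four corners of the unit square plus its midpoint.
sites = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
print(brute_force_M(sites, 1))  # one center: the midpoint serves the corners
print(brute_force_M(sites, 5))  # every site is its own center
```

With one center the midpoint is optimal, and with k = n the cost vanishes, which matches the reading of S as a set of centers chosen from the sites themselves.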

Our main result is the following:

Theorem 1. If $X_i$, $1 \le i < \infty$, are independent and uniform on $[0,1]^2$,
then for any $0 < \alpha < 1$ one has

$$\lim_{n \to \infty} n^{-1/2} M([n\alpha]; X_1, X_2, \ldots, X_n) = C_\alpha \tag{1.2}$$

with probability one for some constant $0 < C_\alpha < \infty$.

For reasons which will be discussed in the last section it will be
useful to generalize this result slightly. For any $1 \le p < \infty$ we
define

$$M_p(k; x_1, x_2, \ldots, x_n) = \min_{S : |S| = k} \sum_{i=1}^{n} \min_{j \in S} \|x_i - x_j\|^p . \tag{1.3}$$

Under the same hypotheses as in Theorem 1 we will prove

Theorem 2. For any $1 \le p < 2$ and $0 < \alpha < 1$ we have with probability
one

$$\lim_{n \to \infty} M_p([\alpha n]; X_1, X_2, \ldots, X_n) / n^{1-p/2} = C_{\alpha,p}$$

for some constant $0 < C_{\alpha,p} < \infty$.
Naturally, it will suffice to prove just Theorem 2, and this proof
will occupy the next three sections. In section two we concentrate on the
key combinatorial observations which will make the theorem possible.

Section three contains a Tauberian argument which partially parallels that
used in Steele (1980b), but with the significant difference that the present
problem lacks the monotonicity previously relied on in the theory of
subadditive Euclidean functionals (Steele 1980a, b). The elementary Lemma 3.4
on the "differentiation" of an asymptotic series is one of the devices
introduced here which may prove useful in other non-monotone problems.

The proof of Theorem 2 is completed in Section 4 with the help of
the Efron-Stein jackknife inequality teamed with the combinatorial lemmas
of Section 2 and classical arguments.

The last section briefly discusses the algorithmic application of
this result. We also discuss the extension of our results to
non-uniformly distributed random variables, and comment briefly on some open
problems.
II. Combinatorial and Geometric Lemmas

Most  of  the  observations  which  are  special  to  the  location  problem 
are  contained  in  this  section,  but  one  can  find  some  hidden  generality 
since  most  subadditive  Euclidean  functionals  have  properties  analogous 
to  those  that  follow. 

Lemma 2.1. There is a constant $\beta = \beta_p$ such that for all $x_i \in [0,1]^2$,
$1 \le i \le n$, and $k \ge 2$

$$M_p(k; x_1, x_2, \ldots, x_n) \le \beta\, n\, k^{-p/2} . \tag{2.1}$$

Proof. Let an integer $s$ be chosen so that $s^2 \le k \le (s+1)^2$, and
divide $[0,1]^2$ into $s^2$ cells of side $s^{-1}$. For each occupied cell
$C_i$ choose one element $x'_i \in C_i$ and set $S' = \{x'_i : 1 \le i \le s^2\}$. Since
each cell has diameter $\sqrt{2}\, s^{-1}$ and since $|S'| \le k$, we have the generous
bound

$$M_p(k; x_1, x_2, \ldots, x_n) \le n (\sqrt{2}\, s^{-1})^p \le 2^{p/2}\, n\, (k^{1/2} - 1)^{-p} ,$$

which yields the lemma.
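The grid construction in this proof can be sketched as follows. The code picks one representative per occupied cell of an s-by-s grid and compares the resulting cost with the bound $n(\sqrt{2}\,s^{-1})^p$. The helper names, the choice $s = \lfloor\sqrt{k}\rfloor$, and the seeded random sample are assumptions made for illustration; the construction gives a feasible (generally suboptimal) set of at most k centers, not the optimal one.

```python
import random
from math import dist, isqrt, sqrt


def grid_centers(points, k):
    """One representative per occupied cell of an s-by-s grid, s = floor(sqrt(k))."""
    s = isqrt(k)
    chosen = {}
    for x in points:
        # min(..., s - 1) keeps a coordinate equal to 1.0 inside the last cell.
        cell = (min(int(x[0] * s), s - 1), min(int(x[1] * s), s - 1))
        chosen.setdefault(cell, x)  # keep the first point seen in each cell
    return list(chosen.values()), s


def grid_cost(points, k, p=1.0):
    """Cost of the grid centers, the bound n*(sqrt(2)/s)**p, and the center count."""
    centers, s = grid_centers(points, k)
    cost = sum(min(dist(x, c) for c in centers) ** p for x in points)
    bound = len(points) * (sqrt(2) / s) ** p
    return cost, bound, len(centers)


rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(200)]
cost, bound, m = grid_cost(pts, 10, p=1.5)
print(m <= 10, cost <= bound)
```

Every site shares a cell with some chosen representative, so its service cost is at most the p-th power of the cell diameter, which is all the bound uses.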

The next lemma will provide the combinatorial linchpin required in
our variance bounds.

Lemma 2.2. There is a constant $\beta' = \beta'_p$ such that for all $x_i \in [0,1]^2$,
$1 \le i \le n$, and $k \ge 2$ the difference

$$M_p(k; x_1, x_2, \ldots, x_n) - M_p(k+1; x_1, x_2, \ldots, x_n) = \Delta_p$$

satisfies the bounds

$$0 \le \Delta_p \le \beta'\, n\, k^{-1-p/2} . \tag{2.2}$$

Proof. Without loss we can assume that all of the $\|x_i - x_j\|^p$ are
different. For any set of optimal centers $S$ associated with
$M_p(k+1; x_1, x_2, \ldots, x_n)$ we define for each $i \in S$ the set

$$N(i) = \{ j : \|x_j - x_i\| < \|x_j - x_{i'}\| ,\ \forall\, i' \in S,\ i' \ne i \} .$$

This gives the representation

$$M_p(k+1; x_1, x_2, \ldots, x_n) = \sum_{i \in S} \sum_{j \in N(i)} \|x_i - x_j\|^p .$$

We now note that

$$\#\Bigl\{ i \in S : \sum_{j \in N(i)} \|x_i - x_j\|^p > 4 M_p(k+1; x_1, x_2, \ldots, x_n)/(k+1) \Bigr\} < (k+1)/4 \tag{2.3}$$

and also that

$$\#\{ i \in S : |N(i)| > 4n/(k+1) \} < (k+1)/4 . \tag{2.4}$$

This implies that for $H$ defined by

$$H = \Bigl\{ i \in S : \sum_{j \in N(i)} \|x_i - x_j\|^p \le 4 \beta n (k+1)^{-p/2}/(k+1) \ \text{and}\ |N(i)| \le 4n/(k+1) \Bigr\}$$

we have by (2.3), (2.4), and Lemma 2.1 that $\#H > (k+1)/2$.

This time we choose $s$ so that $s^2 < (k+1)/2 \le (s+1)^2$ and again
divide $[0,1]^2$ into $s^2$ cells of side $s^{-1}$. Some cell must contain
two elements of $H$, say $x_{i_1}$ and $x_{i_2}$.

We now define a suboptimal choice for the $k$ centers by using
$S' = S \setminus \{x_{i_1}\}$. We continue to serve all $\bigcup_{i \ne i_1} \{x_j : j \in N(i)\}$ as before
but now serve the elements of $\{x_j : j \in N(i_1)\}$ by $x_{i_2}$. The cost of
serving $\{x_j : j \in N(i_1)\}$ by $x_{i_2}$ is

$$\sum_{j \in N(i_1)} \|x_j - x_{i_2}\|^p \le \sum_{j \in N(i_1)} 2^p \bigl( \|x_j - x_{i_1}\|^p + \|x_{i_1} - x_{i_2}\|^p \bigr) \le 2^p \bigl\{ 4 \beta n (k+1)^{-p/2-1} + (\sqrt{2}\, s^{-1})^p (4n)/(k+1) \bigr\} . \tag{2.5}$$

For $k \ge 2$ the last expression can be bounded by $\beta' n k^{-1-p/2}$; and,
since (2.5) majorizes $\Delta_p$, this proves the second inequality in (2.2).
The first inequality is trivial, so the lemma is complete.
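The first inequality in (2.5) is stated without comment. A short worked step, assuming only the triangle inequality and the convexity of $t \mapsto t^p$ for $p \ge 1$, runs as follows (the constant $2^{p-1}$ is what convexity gives; the proof uses the cruder $2^p$):

```latex
\|x_j - x_{i_2}\|^p
  \le \bigl( \|x_j - x_{i_1}\| + \|x_{i_1} - x_{i_2}\| \bigr)^p
  \le 2^{p-1} \bigl( \|x_j - x_{i_1}\|^p + \|x_{i_1} - x_{i_2}\|^p \bigr)
  \le 2^{p} \bigl( \|x_j - x_{i_1}\|^p + \|x_{i_1} - x_{i_2}\|^p \bigr) .
```

Summing over $j \in N(i_1)$, the first terms add up to at most $4\beta n (k+1)^{-p/2-1}$ because $i_1 \in H$, while the second term is at most $(\sqrt{2}\, s^{-1})^p$ (the two centers share a cell) and is counted $|N(i_1)| \le 4n/(k+1)$ times.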

One often finds it useful to know that in an optimal allocation no
site is very far from a center.

Lemma 2.3. There is a constant $\beta'' = \beta''_p$ with the property that for
any $S$ with $|S| = k$ which minimizes $\sum_{i=1}^{n} \min_{j \in S} \|x_i - x_j\|^p$ we have

$$\delta_p = \delta_p(k; x_1, x_2, \ldots, x_n) = \max_{1 \le i \le n} \min_{j \in S} \|x_i - x_j\|^p \le \beta''_p\, n\, k^{-1-p/2} .$$

Proof. If we take $S' = S \cup \{x'\}$ where $x'$ is the element of
$\{x_1, x_2, \ldots, x_n\}$ farthest from any element of $S$ then we have

$$M_p(k+1; x_1, x_2, \ldots, x_n) \le M_p(k; x_1, x_2, \ldots, x_n) - \delta_p .$$

By Lemma 2.2 we then see that $\delta_p \le \beta' n k^{-1-p/2}$, so we can take $\beta'' = \beta'$.
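The key step of this proof, that adding the worst-served site as an extra center lowers the optimal cost by at least $\delta_p$, can be checked numerically on a tiny sample. The sketch below uses exhaustive search; the helper names, the seeded nine-point sample, and the values p = 1.5, k = 3 are illustrative assumptions.

```python
import random
from itertools import combinations
from math import dist


def cost(points, centers, p):
    """Total p-th power cost of serving each site from its nearest center."""
    return sum(min(dist(x, c) for c in centers) ** p for x in points)


def optimal(points, k, p):
    """Return (minimal cost, an optimal k-subset) by exhaustive search."""
    best = min(combinations(points, k), key=lambda S: cost(points, S, p))
    return cost(points, best, p), best


def max_service_cost(points, centers, p):
    """delta_p: the largest p-th power distance from a site to its nearest center."""
    return max(min(dist(x, c) for c in centers) ** p for x in points)


rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(9)]
p, k = 1.5, 3
M_k, S = optimal(pts, k, p)
M_k1, _ = optimal(pts, k + 1, p)
delta = max_service_cost(pts, S, p)
# A small tolerance absorbs floating-point rounding in the comparison.
print(M_k1 <= M_k - delta + 1e-12)
```

The comparison holds for any sample, since the set S together with the worst-served site is one admissible choice of k + 1 centers.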

The next lemma will be useful in applying jackknife methods to obtain
variance bounds. By the notation $\hat{x}_i$ we mean that $x_i$ is to be omitted
from the sample, thus $\{x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n\} = \{x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n\}$.
Also, we let $1(y \le \delta)$ be one or zero accordingly as $y \le \delta$ holds or
not.

Lemma 2.4. Setting $\delta = \delta_p(k; x_1, x_2, \ldots, x_n)$ and $m_i = \min_{j : j \ne i} \|x_j - x_i\|^p$,
we have

$$M_p(k; x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n) \le M_p(k; x_1, x_2, \ldots, x_n) + 2^p \delta \sum_{j=1}^{n} 1(\|x_j - x_i\|^p \le \delta) \tag{2.6}$$

and

$$M_p(k; x_1, x_2, \ldots, x_n) \le M_p(k; x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n) + 2^p (m_i + \delta) . \tag{2.7}$$


Proof. To prove the first inequality let $S$ be an optimal choice of $k$
centers for $\{x_1, x_2, \ldots, x_n\}$. If $x_i \notin S$ the inequality is trivial, so
suppose $x_i \in S$. We now choose a sub-optimal set $S'$ of $k$ centers for
$\{x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n\}$. First let

$$N = \{ x_j : \|x_j - x_i\| < \|x_j - x_{i'}\| ,\ \forall\, i' \in S,\ i' \ne i \} .$$

If $N \ne \{x_i\}$ take any $x' \in N \setminus \{x_i\}$ and set $S' = (S \setminus \{x_i\}) \cup \{x'\}$, but
if $N = \{x_i\}$ just take any $x' \in \{x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n\}$ and define $S'$
as before.

We now serve $\{x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n\}$ by $S'$ as follows. If
$N = \{x_i\}$ then each point is served by the same center as it was served
by in $S$. If $N \ne \{x_i\}$ then the elements of $N \setminus \{x_i\}$ are served by $x'$,
and the others are served as before. The cost of this sub-optimal choice
is bounded by

$$M_p(k; x_1, x_2, \ldots, x_n) + \sum_{j=1}^{n} \|x_j - x'\|^p\, 1(x_j \in N \setminus \{x_i\}) \le M_p(k; x_1, x_2, \ldots, x_n) + 2^p \delta \sum_{j=1}^{n} 1(\|x_j - x_i\|^p \le \delta) .$$

The second inequality (2.7) is easier. Let $S$ be an optimal choice
of centers for $\{x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n\}$ and note that $x_i$ can be served
by an element of $S$ at a cost less than

$$2^p \Bigl( \min_{j : j \ne i} \|x_i - x_j\|^p + \max_{j : j \ne i} \min_{t \in S} \|x_t - x_j\|^p \Bigr) \le 2^p (m_i + \delta) . \qquad ∎$$
III. Regular Expectations.


For brevity we set $M_n = M_p([\alpha n]; X_1, X_2, \ldots, X_n)$. Our first objective
is to show that $E M_n \sim c\, n^{1-p/2}$. The method begins as in the classical approach
taken by Beardwood, Halton, and Hammersley (1959) in the study of the
traveling salesman problem. As noted in the introduction, the main novelty
here is due to the necessity of overcoming the fact that $M_n$ fails to
have the monotonicity $M_n \le M_{n+1}$. The impact of this non-monotonicity is
even more strongly felt in the next section. (The desire to understand a
subadditive Euclidean functional which failed to be monotone provided the
second principal motivation for this work.)

We now let $\Pi$ denote a Poisson point process on $\mathbb{R}^2$. For any
Borel $A \subset \mathbb{R}^2$, $\Pi(A)$ will consist of a set of $N_A$ points uniformly
distributed in $A$, where $N_A$ is itself a Poisson random variable with
mean $\lambda(A)$, the Lebesgue measure of $A$.

Lemma 3.1. Let $A = [0,t]^2$ and set $\phi(t) = E M_p([\alpha N_A]; \Pi(A))$, then for
all integers $m \ge 1$ we have

$$\phi(t) \le m^2 \phi(t/m) . \tag{3.1}$$

Proof. Let $A$ be divided into $m^2$ cells $Q_i$ of side $t/m$, then by the
suboptimality of local optimization we have

$$M_p([\alpha N_A]; \Pi(A)) \le \sum_{i=1}^{m^2} M_p([\alpha N_{Q_i}]; \Pi(Q_i)) .$$

On taking expectations and using the homogeneity of $\Pi$ we obtain
(3.1).
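The phrase "suboptimality of local optimization" compresses two observations, which can be written out as the following sketch. It sets aside cells with $[\alpha N_{Q_i}] = 0$, which need a convention that the text does not spell out, and it uses the fact that $M_p(k; \cdot)$ does not increase when $k$ grows.

```latex
\sum_{i=1}^{m^2} [\alpha N_{Q_i}]
  \le \Bigl[ \alpha \sum_{i=1}^{m^2} N_{Q_i} \Bigr] = [\alpha N_A] ,
\qquad
M_p([\alpha N_A]; \Pi(A))
  \le M_p\Bigl( \sum_{i=1}^{m^2} [\alpha N_{Q_i}]; \Pi(A) \Bigr)
  \le \sum_{i=1}^{m^2} M_p([\alpha N_{Q_i}]; \Pi(Q_i)) .
```

The last inequality holds because the union of the locally optimal center sets is one admissible choice of centers for all of $\Pi(A)$, and serving each point from a center in its own cell can only cost more than serving it from its nearest center overall. Each of the $m^2$ cells then contributes expectation $\phi(t/m)$ by translation invariance of the Poisson process.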


Lemma 3.2. If $\phi(t)$ is any continuous function which satisfies (3.1)
for all $m$ then

$$\lim_{t \to \infty} \phi(t)/t^2 = \liminf_{t \to \infty} \phi(t)/t^2 = C_{\alpha,p} . \tag{3.2}$$

Proof. Let $\epsilon > 0$. By the continuity of $\phi(t)$ and the definition
$C_{\alpha,p} = \liminf_{t \to \infty} \phi(t)/t^2$ we can choose an interval $(a,b)$ such that

$$\phi(t)/t^2 \le C_{\alpha,p} + \epsilon \tag{3.3}$$

for all $t \in (a,b)$. By (3.1) we can conclude that also we have (3.3)
for all $t \in \bigcup_{m=1}^{\infty} (ma, mb)$. Since $I_m = (ma, mb)$ and $I_{m+1}$ intersect for
all $m > a/(b-a)$, we see $\bigcup_{m=1}^{\infty} (ma, mb)$ contains $(m_0 a, \infty)$ for some
integer $m_0$; and, therefore,

$$\limsup_{t \to \infty} \phi(t)/t^2 \le C_{\alpha,p} + \epsilon ,$$

which completes the proof.

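Lemma 3.3. For $1 \le p < 4$ we have for $x \uparrow 1$ that

$$\sum_{n=1}^{\infty} (E M_n)\, x^n \sim C_{\alpha,p}\, \Gamma(2-p/2)/(1-x)^{2-p/2} \tag{3.4}$$

and, as $n \to \infty$,

$$\sum_{k=1}^{n} E M_k \sim C_{\alpha,p}\, (2-p/2)^{-1}\, n^{2-p/2} . \tag{3.5}$$

Proof. Calculating $\phi(t)$ by conditioning and making a change of scale
from $[0,t]^2$ to $[0,1]^2$ shows that (3.2) can be written out as

$$\phi(t) = \sum_{n=1}^{\infty} t^p (E M_n)\, e^{-t^2} t^{2n}/n! \sim C_{\alpha,p}\, t^2 . \tag{3.6}$$

By changing variables $t^2 = u$ we see that

$$\sum_{n=1}^{\infty} (E M_n)\, e^{-u} u^n/n! \sim C_{\alpha,p}\, u^{1-p/2} . \tag{3.7}$$

Now by the Abelian theorem for Borel summability (e.g. Doetsch (1943), p. 191)
and the fact that $1-p/2 > -1$ we have as $x \to 1$ that

$$\sum_{n=1}^{\infty} (E M_n - E M_{n-1})\, x^n \sim C_{\alpha,p}\, \Gamma(2-p/2)/(1-x)^{1-p/2} . \tag{3.8}$$

Since the left side of (3.8) equals $(1-x) \sum_{n=1}^{\infty} (E M_n) x^n$ (with $E M_0 = 0$),
multiplying by $(1-x)^{-1}$ then completes the proof of (3.4). Since
$E M_n \ge 0$ we get (3.5) from (3.4) by an immediate application of the
Karamata Tauberian theorem (Feller (1971), p. 447).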
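The conditioning and rescaling behind (3.6) can be spelled out as follows. The sketch assumes only what the text states: $N_A$ is Poisson with mean $\lambda(A) = t^2$, the points are independent and uniform on $[0,t]^2$ given $N_A = n$, and $M_p$ is homogeneous of degree $p$ in the sites.

```latex
\phi(t)
  = \sum_{n=0}^{\infty} P(N_A = n)\, E\, M_p([\alpha n]; tX_1, \ldots, tX_n)
  = \sum_{n=1}^{\infty} e^{-t^2} \frac{t^{2n}}{n!}\; t^{p}\, E M_n ,
\qquad
M_p(k; tx_1, \ldots, tx_n) = t^{p} M_p(k; x_1, \ldots, x_n) .
```

Here $X_1, \ldots, X_n$ are uniform on $[0,1]^2$ and the $n = 0$ term vanishes. Dividing the asymptotic relation $\phi(t) \sim C_{\alpha,p} t^2$ by $t^p$ and writing $u = t^2$ gives the exponent $1 - p/2$ in (3.7).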

We would now like to "differentiate" (3.5) in order to obtain the
asymptotics of $E M_n$. Fortunately, the next lemma shows that this is
(just barely) legitimate.

Lemma 3.4. If $\sum_{k=1}^{n} m_k \sim c\, n^{\gamma}$ for $\gamma > 1$ and $m_{k+1} \ge m_k - B k^{\gamma-2}$ for
some $B$ and all $k \ge 1$ then

$$m_n \sim c \gamma\, n^{\gamma-1} . \tag{3.9}$$

Proof. Let $y > 1$ be chosen and note that

$$\sum_{n \le k \le yn} m_k \ge \sum_{n \le k \le yn} \Bigl( m_n - B \sum_{j=n}^{k} j^{\gamma-2} \Bigr) \ge n(y-1)\, m_n - B \sum_{k=n}^{yn} \sum_{j=n}^{k} j^{\gamma-2} .$$

Dividing by $n^{\gamma}$ and using the Euler-Maclaurin summation formula to handle
the double sum gives

$$(y^{\gamma} - 1)\, c \ge (y-1) \limsup (m_n/n^{\gamma-1}) - B (\gamma-1)^{-1} \{ \gamma^{-1} (y^{\gamma} - 1) - (y-1) \} .$$

Next dividing by $y-1$ and letting $y \downarrow 1$ shows

$$\gamma c \ge \limsup (m_n/n^{\gamma-1}) .$$

In a completely analogous way one can show that $\liminf (m_n/n^{\gamma-1}) \ge \gamma c$
by estimating the sum $\sum_{yn \le k \le n} m_k$ where $y < 1$. ∎
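The double sum in this proof can be handled by a direct integral comparison, which is the leading term of the Euler-Maclaurin formula. The following worked step is a sketch of that calculation, valid for $\gamma > 1$ as $n \to \infty$ with $y$ fixed:

```latex
\sum_{k=n}^{yn} \sum_{j=n}^{k} j^{\gamma-2}
  \sim \int_{n}^{yn} \frac{u^{\gamma-1} - n^{\gamma-1}}{\gamma - 1}\, du
  = \frac{n^{\gamma}}{\gamma - 1} \Bigl\{ \gamma^{-1} (y^{\gamma} - 1) - (y - 1) \Bigr\} ,
\qquad
\frac{\gamma^{-1} (y^{\gamma} - 1) - (y - 1)}{y - 1} \longrightarrow 0
\quad (y \downarrow 1) .
```

The second relation is why the correction term disappears after dividing by $y - 1$, while $(y^{\gamma} - 1)/(y - 1) \to \gamma$ produces the constant $\gamma c$.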

The next lemma is the main consequence of this section.

Lemma 3.5. For $1 \le p < 2$ we have $E M_n \sim C_{\alpha,p}\, n^{1-p/2}$ as $n \to \infty$.

Proof. We already have $\sum_{k=1}^{n} E M_k \sim C_{\alpha,p} (2-p/2)^{-1} n^{2-p/2}$ by (3.5), so by Lemma 3.4
with $1 < \gamma = 2-p/2$ it suffices to show

$$E M_{\ell+1} \ge E M_{\ell} - B\, \ell^{-p/2} \tag{3.10}$$

for some $B$ and all $\ell$. By Lemma 2.4 (with sample size $\ell+1$ and the point
$X_{\ell+1}$ omitted) and by Lemma 2.2 (if $[(\ell+1)\alpha] > [\ell\alpha]$) we have

$$M_{\ell+1} \ge M_{\ell} - 2^p \delta \sum_{j=1}^{\ell+1} 1(\|X_j - X_{\ell+1}\|^p \le \delta) - ([(\ell+1)\alpha] - [\ell\alpha])\, \Delta_p . \tag{3.11}$$

Here by Lemma 2.3, $\delta \le \beta''_p\, \ell\, [\ell\alpha]^{-1-p/2}$; and by Lemma 2.2,
$\Delta_p \le \beta'\, \ell\, [\ell\alpha]^{-1-p/2}$. By elementary estimates we then see that

$$E \sum_{j=1}^{\ell+1} 1(\|X_j - X_{\ell+1}\|^p \le \delta)$$

is bounded, so taking expectations in (3.11) yields (3.10). ∎



IV. Completion of the Proof.

The results of the preceding sections will now be brought together
to prove Theorem 2. The only new tool required is the recent result of
Efron-Stein (1979) which says that the jackknife estimate of variance is
positively biased. Explicitly, we first suppose that $S(x_1, x_2, \ldots, x_{n-1})$
is any symmetric function of $n-1$ vectors $x_i$. For each $i$ we set
$S_i = S(x_1, x_2, \ldots, \hat{x}_i, \ldots, x_n)$ and also set $\bar{S} = n^{-1} \sum_{i=1}^{n} S_i$. If $X_i$ are
any independent and identically distributed random vectors, the Efron-Stein
jackknife inequality says that

$$\operatorname{Var} S(X_1, X_2, \ldots, X_{n-1}) \le E \sum_{i=1}^{n} (S_i - \bar{S})^2 . \tag{4.1}$$


We  will  now  apply  this  inequality  with  the  aid  of  the  combinatorial 
bounds  of  Section  2. 
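As a sanity check on the form of (4.1), both sides can be computed exactly for a small discrete example by enumerating all outcomes. The sketch below is an illustration only: the function name `efron_stein_sides`, the choice of scalar observations uniform on {0, 1, 2}, and the symmetric statistic `spread` (range of the sample) are assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def efron_stein_sides(S, values, n):
    """Exact Var S(X_1..X_{n-1}) and E sum_i (S_i - S_bar)^2 for X_i uniform on values."""
    # Right side of (4.1): leave-one-out values S_i and their average S_bar.
    outcomes = list(product(values, repeat=n))
    weight = Fraction(1, len(outcomes))
    rhs = Fraction(0)
    for xs in outcomes:
        loo = [S(xs[:i] + xs[i + 1:]) for i in range(n)]
        s_bar = sum(loo, Fraction(0)) / n
        rhs += weight * sum((s - s_bar) ** 2 for s in loo)
    # Left side of (4.1): variance of S on a sample of size n - 1.
    small = list(product(values, repeat=n - 1))
    w = Fraction(1, len(small))
    mean = sum((w * S(xs) for xs in small), Fraction(0))
    var = sum((w * (S(xs) - mean) ** 2 for xs in small), Fraction(0))
    return var, rhs


def spread(xs):
    """A symmetric statistic: the range of the sample."""
    return Fraction(max(xs) - min(xs))


var, rhs = efron_stein_sides(spread, (0, 1, 2), 4)
print(var <= rhs)
```

Exact rational arithmetic avoids any rounding question in the comparison. For a linear statistic such as the sample sum the two sides coincide, so the inequality cannot be improved in general.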

Lemma 4.1. If $X_i$, $1 \le i < \infty$, are independent and uniformly distributed
on $[0,1]^2$, then for a constant $C$ not depending on $n$ we have

$$\operatorname{Var} M_p([\alpha n]; X_1, X_2, \ldots, X_n) = \operatorname{Var} M_n \le C\, n^{1-p} . \tag{4.2}$$

Proof. We first note that if $\bar{S}$ is replaced by any other variable,
the right side of (4.1) is only increased. Using (4.1) and Lemma 2.4
we now calculate (with $\delta = \delta_p([n\alpha]; X_1, X_2, \ldots, X_{n+1})$):

$$\operatorname{Var} M_n \le E \sum_{i=1}^{n+1} \bigl( M_p([\alpha n]; X_1, X_2, \ldots, \hat{X}_i, \ldots, X_{n+1}) - M_p([\alpha n]; X_1, X_2, \ldots, X_{n+1}) \bigr)^2 \le E \sum_{i=1}^{n+1} \Bigl( 2^p \delta \sum_{j=1}^{n+1} 1(\|X_j - X_i\|^p \le \delta) + 2^p \delta + 2^p \min_{j : j \ne i} \|X_j - X_i\|^p \Bigr)^2 . \tag{4.3}$$

Replacing $\delta$ by $\beta'' n [\alpha n]^{-1-p/2} = \rho_n$ will by Lemma 2.3 only increase the
right side, so using Vinogradov's symbol to ignore irrelevant constants
we have

$$\operatorname{Var} M_n \ll n \rho_n^2\, E \Bigl( \sum_{j=1}^{n+1} 1(\|X_j - X_1\|^p \le \rho_n) \Bigr)^2 + n \rho_n^2 + n\, E \min_{j : j \ne 1} \|X_j - X_1\|^{2p} . \tag{4.4}$$

Now, since $\rho_n^2 \ll n^{-p}$, it is an elementary calculation to show

$$E \Bigl( \sum_{j=1}^{n+1} 1(\|X_j - X_1\|^p \le \rho_n) \Bigr)^2 \ll E \Bigl( \sum_{j=1}^{n+1} 1(\|X_j - X_1\| \le n^{-1/2}) \Bigr)^2 \ll 1$$

and

$$E \min_{j : j \ne 1} \|X_j - X_1\|^{2p} \ll n^{-p} .$$

These bounds and (4.4) imply (4.2) and complete the lemma.

We now note that $\operatorname{Var}(M_n/n^{1-p/2}) \ll 1/n$ and that this bound is not
sharp enough to automatically imply complete convergence. It is therefore
necessary to resort to a subsequence and maximal argument to prove
Theorem 2.

By the bound (4.2), Lemma 3.5, the Borel-Cantelli lemma and
Chebyshev's inequality one can easily show

$$\lim_{k \to \infty} M_{n_k}/n_k^{1-p/2} = C_{\alpha,p} \quad \text{a.s.} \tag{4.5}$$

for the subsequence $n_k = [k^{\gamma}]$ for any $\gamma > 1$.
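The "easily shown" step can be written out in two lines. The sketch below uses only the variance bound (4.2) and the choice $n_k = [k^{\gamma}]$ with $\gamma > 1$; $\epsilon > 0$ is arbitrary.

```latex
P\Bigl( \bigl| M_{n_k} - E M_{n_k} \bigr| > \epsilon\, n_k^{1-p/2} \Bigr)
  \le \frac{\operatorname{Var} M_{n_k}}{\epsilon^{2}\, n_k^{2-p}}
  \le \frac{C}{\epsilon^{2}\, n_k} ,
\qquad
\sum_{k=1}^{\infty} \frac{1}{n_k} = \sum_{k=1}^{\infty} \frac{1}{[k^{\gamma}]} < \infty .
```

By the Borel-Cantelli lemma the events on the left occur only finitely often with probability one, so $M_{n_k}/n_k^{1-p/2}$ and $E M_{n_k}/n_k^{1-p/2}$ have the same limit, and Lemma 3.5 identifies that limit as $C_{\alpha,p}$. Along the full sequence the corresponding series $\sum 1/n$ diverges, which is why a subsequence is needed.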


We now set

$$D_k = \max_{n_k \le n < n_{k+1}} \bigl| M_n/n^{1-p/2} - M_{n_k}/n_k^{1-p/2} \bigr|$$

and note that to complete the proof of the theorem it suffices to show $D_k \to 0$
a.s. For this it certainly suffices to show $E \sum_{k=1}^{\infty} D_k^2 < \infty$.


We set $a_n = | M_{n+1}/(n+1)^{1-p/2} - M_n/n^{1-p/2} |$ and note
$a_n \ll |M_{n+1} - M_n|/n^{1-p/2} + M_n/n^{2-p/2}$. By Lemma 2.1 we have $M_n \ll n^{1-p/2}$,
and if $[\alpha(n+1)] = [\alpha n]$ the same estimates used in (4.4) will show
$E(M_{n+1} - M_n)^2 \ll n^{-p}$. If $[\alpha(n+1)] = [\alpha n] + 1$ we first note $M_{n+1} \le M_n$.
Now we can also check that $M_n$ cannot be much bigger than $M_{n+1}$.
Setting $k = [\alpha n]$ we have by Lemma 2.2 that

$$M_n \le M_p(k+1; X_1, X_2, \ldots, X_n) + \beta' n k^{-1-p/2}$$

and by Lemma 2.4 (and 2.3) that

$$M_p(k+1; X_1, X_2, \ldots, X_n) \le M_p(k+1; X_1, X_2, \ldots, X_{n+1}) + 2^p \delta \sum_{i=1}^{n} 1(\|X_i - X_{n+1}\|^p \le \delta)$$

where $\delta = \beta''_p (n+1)(k+1)^{-1-p/2}$. Together these bounds and elementary
calculations show in the case that $[\alpha(n+1)] = [\alpha n] + 1$ that one again has
$E(M_{n+1} - M_n)^2 \ll n^{-p}$, and hence $E a_n^2 \ll 1/n^2$.

The final calculation is that

$$E D_k^2 \le E \Bigl( \sum a_n \Bigr)^2 \le (n_{k+1} - n_k) \Bigl( \sum E a_n^2 \Bigr) \ll k^{\gamma-1} \sum n^{-2} \ll k^{-2} ,$$

where the three sums are each over the range $n_k \le n < n_{k+1}$.

This verifies $E \sum_{k=1}^{\infty} D_k^2 < \infty$ and completes the proof of Theorem 2,
except for verifying that indeed $C_{\alpha,p} > 0$. To show this last fact
we set $Z_n = n^{-1} \sum_{1 \le i < j \le n} 1(\|X_i - X_j\| \le \beta/\sqrt{n})$ and note that easy calculations
show

$$E Z_n \sim \beta^2 \pi/2 \quad \text{as } n \to \infty , \text{ and} \tag{4.6}$$

$$\operatorname{Var} Z_n \to 0 \quad \text{as } n \to \infty . \tag{4.7}$$

Since $M_n$ is the sum of $n - [n\alpha]$ elements of $\{ \|X_i - X_j\|^p : 1 \le i < j \le n \}$
we have

$$M_n \ge (\beta n^{-1/2})^p \cdot 1(n Z_n < (n - [n\alpha])/2) \cdot (n - [n\alpha])/2 . \tag{4.8}$$

Taking expectations in (4.8) we have

$$n^{p/2-1} E M_n \ge \beta^p\, P(Z_n < (1-\alpha)/2) \cdot (1-\alpha)/2 . \tag{4.9}$$

For $\beta^2 < (1-\alpha)/\pi$ equations (4.6), (4.7) and Chebyshev's inequality will
suffice to show the right side of (4.9) is bounded away from 0. This
shows $C_{\alpha,p} > 0$ and completes the proof.
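The "easy calculation" behind (4.6) can be sketched as follows, assuming that $Z_n$ carries the normalization $n^{-1}$ (which is what the factors in (4.8) and (4.9) suggest) and that boundary effects of the unit square are of lower order for small radii:

```latex
E Z_n
  = \frac{1}{n} \binom{n}{2}\, P\bigl( \|X_1 - X_2\| \le \beta n^{-1/2} \bigr)
  \sim \frac{1}{n} \cdot \frac{n^{2}}{2} \cdot \frac{\pi \beta^{2}}{n}
  = \frac{\beta^{2} \pi}{2} .
```

With $\beta^2 < (1-\alpha)/\pi$ this limit is strictly below $(1-\alpha)/2$, so by (4.7) and Chebyshev's inequality the probability $P(Z_n < (1-\alpha)/2)$ tends to one.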
V. Algorithmic Implications.

The fact that the K-median problem has been proved by Papadimitriou
(1980) to be NP-hard means that it is extremely unlikely that there is
an efficient algorithm for calculating the optimal choice of $\alpha n$ centers
from $n$ sites (cf. Karp (1972)). Therefore, since the K-median problem
occurs in a variety of practical contexts, it seems quite desirable to
find efficient algorithms which are capable of providing approximately
optimal center selections.

The results of this article take a step toward this by providing an
estimate for the value of an optimal selection. This value can be used
in the construction of approximately optimal probabilistic algorithms
for the K-median in a manner which is completely parallel to the way the
asymptotic optimal value provided by Beardwood, Halton, and Hammersley
(1959) has been used by Karp (1977) in the study of the Traveling
Salesman problem. One algorithm of this type for the K-median problem
(but with $K < \log n$) has already been constructed in Fisher and Hochbaum
(1979).
VI. Concluding Remarks and Open Problems.

One of the motives for investigating the functional

$$M_p([\alpha n]; X_1, X_2, \ldots, X_n) = \min_{S : |S| = [n\alpha]} \sum_{i=1}^{n} \min_{j \in S} \|X_i - X_j\|^p$$

for general $p$ is the trite observation that as $p \to \infty$ we have

$$M_p([\alpha n]; X_1, X_2, \ldots, X_n)^{1/p} \to \min_{S : |S| = [n\alpha]} \max_{1 \le i \le n} \min_{j \in S} \|X_i - X_j\| = H_n .$$

The $H_n$ functional is of independent interest and it was hoped that the
present methods might throw some light on its probabilistic behavior.
We now believe that $n^{-1/2} H_n$ converges in distribution, but we have no
idea how this might be proved. Since our methods seem to require $1 \le p < 2$
and are more pertinent to strong laws, an entirely new technique may be needed.

There are also basic open problems directly concerning
$M_n = M_p([\alpha n]; X_1, X_2, \ldots, X_n)$. In particular, it seems almost certain that
a result analogous to our Theorem 2 must hold when the $X_i$ are independent,
identically distributed, and bounded. The methods used in Steele (1980a, c)
seem to fail to help in the location problem because of the difficulty of
establishing the intermediate result for step densities.

Finally, there is the question of determining $C_{\alpha,p}$. This is usually
hopeless, but perhaps not in this case. Fejes-Tóth (1959) was able to
determine the analogous constant in the original Steinhaus problem, and
McClure (1976) has been able to extend the work of Fejes-Tóth to other
functionals and extremal problems.